Abstract

One  aspect  of  fault-tolerance  in  process  control  programs  is  the 
ability  to  tolerate  sensor  failure.  This  paper  presents  a  methodology 
for  transforming  a  process  control  program  that  cannot  tolerate  sensor 
failures into one that can. Issues addressed include modifying specifications
in order to accommodate uncertainty in sensor values and
averaging sensor values in a fault-tolerant manner. In addition, a hierarchy
of sensor failure models is identified, and both the attainable
accuracy and the run-time complexity of sensor averaging with respect
to this hierarchy are discussed.

Keywords: fault-tolerance, process control systems, real-time distributed
systems.


1  Introduction 


A process control program communicates and synchronizes with a physical
process. Typically, the program reads values from the physical process
through sensors and writes values through actuators, as shown schematically
in Figure 1. This paper is concerned with tolerating failures of
continuous-valued sensors.

The  approach  developed  in  this  paper  is  outlined  as  follows: 
Figure 1: A process-control program

1.  A  specification  of  the  control  program  is  written  in  terms  of  the  state 
variables  of  the  physical  system.  For  example,  the  specification  of  a 
program  controlling  a  chemical  reaction  vessel  would  refer  to  a  variable 
T  whose  value  is  assumed  to  be  the  temperature  of  the  vessel. 

2.  Each  physical  state  variable  referenced  by  the  specification  is  replaced 
with  a  reference  to  an  abstract  sensor.  An  abstract  sensor  is  a  set 
of  values  that  contains  the  physical  variable  of  interest.  Uncertainty 
in  sensor  values  now  becomes  an  issue,  and  the  specification  must  be 
re-examined  and  possibly  changed  to  accommodate  it. 

3. The control program is written based on the specification produced by
Step 2. This program reads abstract sensors that are assumed to always
contain the correct value of the corresponding physical variables.

4. For each abstract sensor referenced by the program written in Step 3,
a set of abstract sensors that fail independently is constructed. Each
abstract sensor is implemented using a concrete sensor, which is a
physical device that “reads” a physical variable, such as a thermometer
(see footnote 1). This step will require some knowledge of the physical
process being controlled as well as the specification of the concrete sensor.

5. A fault-tolerant averaging algorithm is used with these replicated
abstract sensor values in order to calculate another abstract sensor that
is correct even if some of the original sensors are incorrect. The
averaging algorithm assumes that no more than f out of the n abstract
sensors are incorrect, where f is a parameter. The relation between n
and f depends on the way sensors can fail.

Footnote 1: The concrete sensor need not sense the exact physical state
variable of interest. For example, an abstract temperature sensor could be
constructed from a pressure gauge by using the ideal gas law: PV = nRT.

The  resulting  system  will  have  a  structure  like  that  shown  in  Figure  2. 

Figure 2: Replicated sensors (each abstract sensor is built on a concrete sensor)

The rest of the paper is organized as follows. In Section 2, we define a
method of representing sensors that makes them amenable to replication and
discuss the effect of uncertainty on process control program specifications.
In Section 3, we discuss sensor failure models and present a sensor averaging
algorithm. Section 4 contains a demonstration of our methodology.

2  Physical  State  Variables  and  Concrete  Sensors 

A  variable  in  a  computer  is  quite  different  from  a  state  variable  in  a  physical 
process.  A  computer  variable  takes  on  values  from  a  finite  domain,  and 
can  assume  only  a  bounded  number  of  values  in  any  finite  time  period. 
A  physical  state  variable,  however,  may  take  on  any  real  value  at  arbitrary 
times. A convenient way to represent a physical state variable in a computer
program  is  as  a  function.  The  domain  of  such  a  function  is  typically  time,  but 
it  can  be  some  other  physical  variable,  depending  on  the  safety  properties 
of  interest. 

A concrete sensor is a device that can be used to sample a physical state
variable. For example, a computer controlling a reaction vessel might have a
thermometer as a concrete sensor. A concrete sensor may interact with the
computer in a variety of ways: the computer may poll the sensor, the sensor
may asynchronously alert the computer when a certain value is sensed, or
the sensor may send a stream of values to the computer where each value
indicates that the physical variable has changed by a certain amount. We
will assume that a concrete sensor $\sigma$ has a specification $\Phi$, and
will call this sensor faulty if it exhibits a behavior not consistent with
its specification.

For example, consider a thermometer whose value is read by polling.
Suppose this concrete sensor returns a value $\hat{T}$ with an accuracy of
$\epsilon$ degrees and the computer obtains the sensor’s value within
$\delta$ seconds of the thermometer being sampled. If the time the computer
program receives $\hat{T}$ is $\hat{t}$, then the specification of this
thermometer is:

$$\Phi(\hat{T}, \hat{t}) \equiv \exists t_0 : \hat{t} - \delta \le t_0 \le \hat{t} : \hat{T} - \epsilon/2 \le T(t_0) \le \hat{T} + \epsilon/2$$

A concrete sensor is not a very convenient mechanism. For example, with
the thermometer:

•  The  sensor  has  a  limited  accuracy.  Network  delay  and  processor 
scheduling  further  limit  the  accuracy  of  the  sensor. 

•  The  control  program  may  be  interested  in  a  temperature  at  a  time  the 
thermometer  was  not  sampled.  A  value  must  then  be  interpolated; 
doing  so  requires  knowledge  of  the  physical  process  being  monitored. 

• Some properties of the concrete sensor, while important to the
implementation, should be irrelevant to the specification used by the process
control program. For example, another thermometer might generate
an interrupt if the temperature rises above 100 degrees. This is an
important property of the sensor: it allows for an accurate determination
of when 100 degrees is reached. There may be other ways to make the
same kind of precise measurement, however, for a sensor that is polled.
It would be convenient if the control program could be the same for
any method of measurement, as long as the measurement is accurate
enough.
We will address these difficulties in two ways. The first problem cannot
be  eliminated,  so  in  Section  2.2  the  effect  of  inaccuracy  in  specifications  is 
addressed.  The  other  two  problems,  interpolation  and  data  abstraction,  are 
addressed  here  by  abstract  sensors. 

2.1  Abstract  Sensors 

An abstract sensor is a piecewise continuous function from a physical state
variable to a dense interval of real numbers. We will denote an abstract
sensor with an overbar over the variable, such as $\bar{T}(t)$. When possible,
we will simply write $\bar{T}$ if we are interested in the “current” value;
that is, the sensor value for the current value of t. Intuitively, interval
$\bar{T}$ represents the possible values of T, given the imprecision of the
concrete sensor used to compute $\bar{T}$ and any uncertainty in the physical
process.

An abstract sensor $\bar{T}(t)$ can be represented as a pair of functions
$T_{min}(t)$ and $T_{max}(t)$, allowing $\bar{T}(t)$ to be the interval
$[T_{min}(t) \,..\, T_{max}(t)]$. The accuracy of an abstract sensor is the
width of the interval, or $|\bar{T}(t)|$. With this representation,
$\min \bar{T}(t) = T_{min}(t)$, $\max \bar{T}(t) = T_{max}(t)$, and
$|\bar{T}(t)| = T_{max}(t) - T_{min}(t)$.

An abstract sensor $\bar{T}$ is correct if it is not too inaccurate and always
includes the value of the actual physical variable. More precisely, for some
upper bound $acc_{\bar{T}}$ on the accuracy of $\bar{T}$,

$$\bar{T} \text{ correct over } D \;\stackrel{\mathrm{def}}{=}\; \forall t \in D : \min \bar{T}(t) \le T(t) \le \max \bar{T}(t) \;\wedge\; |\bar{T}(t)| \le acc_{\bar{T}}$$

We assume that a failure of an abstract sensor can arise when the
underlying concrete sensor fails. As will be discussed in Section 3, a hierarchy
of failure classes can be defined:

• fail-stop failures (following [17]), in which a failed abstract sensor can
be detected (see footnote 2);

• arbitrary failures with bounded inaccuracy (see footnote 3), in which
either $|\bar{T}(t)| \le acc_{\bar{T}}$ is always true or $acc_{\bar{T}}$ is
known, and thus abstract sensors that are too inaccurate can be detected;

• arbitrary failures, in which an abstract sensor can fail arbitrarily.

Footnote 2: The value of a failed fail-stop sensor can be defined to be the
empty interval whose value is [e .. e − 1] for some value of e. The empty
interval has the convenient properties that it contains no points and
intersects no interval, including itself.

Footnote 3: We use the term bounded inaccuracy to refer to bounding from
above the accuracy of an abstract sensor. Similarly, an abstract sensor is
too inaccurate if the numeric value of its accuracy is too large.
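The definitions above can be made concrete with a small sketch. It assumes that an abstract sensor value at one instant is stored as a closed interval with two endpoints; the names `Interval`, `is_correct` and `FAILED` are hypothetical and are not taken from the paper. The empty interval [e .. e − 1] from footnote 2 is used to mark a detected fail-stop failure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One reading of an abstract sensor: the closed interval [lo .. hi]."""
    lo: float
    hi: float

    def is_empty(self) -> bool:
        # [e .. e - 1] is empty for any e, because lo > hi.
        return self.lo > self.hi

    def width(self) -> float:
        # The accuracy of the reading is the width of the interval.
        return 0.0 if self.is_empty() else self.hi - self.lo

    def contains(self, x: float) -> bool:
        # An empty interval contains no point, since lo > hi.
        return self.lo <= x <= self.hi

    def intersects(self, other: "Interval") -> bool:
        # The empty interval intersects no interval, including itself.
        if self.is_empty() or other.is_empty():
            return False
        return max(self.lo, other.lo) <= min(self.hi, other.hi)


def is_correct(reading: Interval, true_value: float, acc_bound: float) -> bool:
    """Correctness at one instant: contains the true value, not too wide."""
    return reading.contains(true_value) and reading.width() <= acc_bound


# A detected fail-stop failure, encoded as the empty interval [0 .. -1].
FAILED = Interval(0.0, -1.0)
```

A reading that contains the true value but is wider than the accuracy bound is still incorrect under this definition, which is what distinguishes the bounded-inaccuracy failure class from fully arbitrary failures.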

Given a concrete sensor, it may not be easy to implement an abstract
sensor. In general, it may require considerable knowledge about the physical
process being monitored. For example, consider the specification
$\Phi(\hat{T}, \hat{t})$ for the polled thermometer. The specification, alone,
is not sufficient information to define an abstract sensor $\bar{T}$, since we
don’t know how to interpolate values between successive sensor readings.
Suppose, however, we know from the physical process being monitored that
$|dT/dt| \le \Delta_T$. This bound on the change of T allows us to interpolate
intermediate values with a known accuracy. The abstract sensor $\bar{T}(t)$
can be defined as

$$\hat{T} - \epsilon/2 - \Delta_T (t - \hat{t} + \delta) \;\le\; \bar{T}(t) \;\le\; \hat{T} + \epsilon/2 + \Delta_T (t - \hat{t} + \delta) \quad \text{for } t \ge \hat{t}$$

One can use this example as a recipe for writing abstract sensors, but the
resulting sensor may be too inaccurate for any practical use. For example,
if $|dT/dt|$ can be bounded more tightly at certain known times, a more
accurate sensor can be constructed. In Section 4, the development of an
abstract sensor is shown in some detail.
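As a minimal sketch of this recipe, the function below computes the interval for the polled thermometer at a time t no earlier than the time the reading was received. The function and parameter names are hypothetical; `rate_bound` stands for the assumed bound on how fast the temperature can change, `epsilon` for the thermometer's accuracy and `delta` for the maximum delay between sampling and receipt.

```python
def polled_thermometer_interval(reading, received_at, t, epsilon, delta, rate_bound):
    """Return (lo, hi) bounding the temperature at time t >= received_at.

    The sample was taken at most `delta` seconds before `received_at`, so at
    time t it is at most (t - received_at + delta) seconds old. During that
    time the temperature can have drifted by at most rate_bound times the age.
    """
    if t < received_at:
        raise ValueError("t must not precede the time the reading was received")
    age = t - received_at + delta
    slack = epsilon / 2 + rate_bound * age
    return (reading - slack, reading + slack)
```

For a reading of 100.0 received at time 10.0 with `epsilon = 1.0`, `delta = 0.5` and `rate_bound = 2.0`, the interval at time 11.0 is (96.5, 103.5): the width grows linearly with the age of the reading, which is why a tighter rate bound yields a more accurate sensor.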

2.2  Abstract  Sensors  in  Specifications 

The specification of a system typically includes a set of safety conditions:
predicates on the state of the system that the implementation must ensure
are always true. A safety condition on a process control program will
reference physical state variables. For example, consider a reaction vessel
with a pressure relief valve. One safety condition might be that whenever the
pressure p is greater than some ceiling $p_{max}$, the valve must be open. We
could write this safety condition as $p > p_{max} = open$, where open is a
state function that is true when the valve is open.

The specification of a process control program will have to be changed
when expressed in terms of abstract sensors. It is not possible to take a
control program written in terms of physical state variables and, for each
reference to such a variable, substitute a reference to a corresponding
abstract sensor. Consider $p > p_{max} = open$. The condition that results
from replacing the physical state variable with an abstract sensor is
$\bar{p} > p_{max} = open$; one must decide what the term
$\bar{p} > p_{max}$ means.
Let S be a predicate on the system state and V be the set of physical
variables mentioned in S that will be accessed through abstract sensors. We
need another condition S' that contains no references to any $v_i \in V$ but
may instead contain references to $\bar{v}_i$. The only constraint on S' is
that it reduces to S when the abstract sensors have perfect accuracy (see
footnote 4):

$$(S' \wedge \forall i : |\bar{v}_i| = 0) \Rightarrow S^{v_i}_{\bar{v}_i}$$

There are several ways such an S' can be constructed. We could replace
all references to $v_i$ in S with references to the midpoint of $\bar{v}_i$.
However, if all values in $\bar{v}_i$ have the same likelihood of being valid,
then there are only two reasonable alternatives. We can either require that
all points in $\bar{v}_i$ satisfy S or that there exists at least one point in
$\bar{v}_i$ that satisfies S. More precisely, for each physical variable $v_i$
the condition S can be generalized as

$$S' \stackrel{\mathrm{def}}{=} \forall v_i \in \bar{v}_i : S \qquad \text{or} \qquad S' \stackrel{\mathrm{def}}{=} \exists v_i \in \bar{v}_i : S$$

The generalization of S cannot be done automatically, since it is really a
refinement of the problem specification. Ideally, one would like to strengthen
S so that states excluded by the safety condition are still excluded. For
example, we might want to assert that a catalyst is injected (denoted by
the state function C) only when the pressure is above a minimum value:
$C \Rightarrow (p > p_{min})$. In this case, the state we are trying to avoid
is one where the catalyst is injected at too low a pressure, and we can
strengthen $C \Rightarrow (p > p_{min})$ to
$C \Rightarrow (\forall p \in \bar{p} : p > p_{min})$.

We may find, however, that a specification cannot be strengthened in a
meaningful way. The property $p > p_{max} = open$ is an example. Changing
the property to $(\forall p \in \bar{p} : p > p_{max}) = open$ will allow
states with $p > p_{max}$ and $\neg open$, and changing the specification to
$(\exists p \in \bar{p} : p > p_{max}) = open$ will allow states with
$p < p_{max}$ and open. Unless we can guarantee that $|\bar{p}| = 0$, the
program’s specification must be changed. Here, we are probably more
interested in avoiding an explosion of the vessel. If so, the condition we
want is $(\exists p \in \bar{p} : p > p_{max}) = open$, and we would accept
the fact that the pressure valve may be unnecessarily open.
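For threshold predicates such as these, the two generalizations reduce to comparisons against a single endpoint of the interval. The sketch below assumes a non-empty closed interval stored as a `(lo, hi)` pair; all function names are hypothetical.

```python
def forall_above(interval, threshold):
    """(forall p in interval : p > threshold) holds iff the low end exceeds it."""
    lo, _hi = interval
    return lo > threshold


def exists_above(interval, threshold):
    """(exists p in interval : p > threshold) holds iff the high end exceeds it."""
    _lo, hi = interval
    return hi > threshold


def catalyst_may_be_injected(p_bar, p_min):
    # Strengthened condition: every possible pressure is above the minimum.
    return forall_above(p_bar, p_min)


def valve_should_be_open(p_bar, p_max):
    # Conservative condition: some possible pressure is above the ceiling.
    return exists_above(p_bar, p_max)
```

With a pressure reading of (9.5, 10.5) and a ceiling of 10.0, `valve_should_be_open` is true although `forall_above` is false: the valve opens even though the true pressure may be below the ceiling, which is the accepted cost of avoiding an explosion.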

It shouldn’t be surprising that, in some cases, a property of a
specification must be changed (as compared to being strengthened) when
references to physical state variables are replaced with references to
abstract sensors. Using abstract sensors exposes uncertainty in the physical
process’s state, and a specification may have been written implicitly assuming
no such uncertainty. Of course, specifications are sometimes written with such
uncertainty explicitly mentioned. For example, an informal expression of the
pressure relief valve property might be “if the pressure rises to within 0.1
millibars of $p_{max}$ then the relief valve must open”. In our notation, this
property would be expressed as
$((\exists p \in \bar{p} : p > p_{max}) = open) \wedge (|\bar{p}| \le 0.1)$.

Footnote 4: The expression $S^{v_i}_{\bar{v}_i}$ is S with all occurrences of
physical state variable $v_i$ changed to abstract sensor $\bar{v}_i$.

3  Fault-Tolerant  Abstract  Sensors 

Given n independent abstract sensors and some assumptions about failures,
we would like to construct an abstract sensor that is tolerant of failures. We
will first present an algorithm that constructs a sensor containing the correct
value given that no more than f of the original sensors are incorrect. We
will then consider how this algorithm performs with different failure models.

3.1  Fault-Tolerant  Sensor  Averaging 

Let $\bar{T}_i$ and $\bar{T}_j$ ($i \ne j$) be two abstract sensors for the
same physical value T. If $\bar{T}_i$ and $\bar{T}_j$ both contain the correct
value, then the intervals $\bar{T}_i$ and $\bar{T}_j$ must intersect, and their
intersection must contain the (unknown) value T.

If f or fewer sensors do not contain the correct value, then any
(n − f)-clique, or set of n − f mutually intersecting sensors, may contain the
correct value, since its members share a common value. Conversely, any point
not contained in at least n − f intervals cannot be the correct value; if it
were, then there would be more than f sensors that do not contain the correct
value. So, the cover of all (n − f)-cliques must contain the correct value.
This gives us an abstract sensor averaging algorithm.

Algorithm 1 Fault-tolerant Sensor Averaging

Let S be a set of values taken from n abstract sensors, and suppose
the abstract sensors are of the same physical state variable
where their values were read at the same point in their domain
(e.g. at the same time). Assuming that at most f of these sensors
are incorrect, calculate $\bigcap_{f,n}(S)$, which is the smallest interval
that is guaranteed to contain the correct physical value.

Implementation: Let l be the smallest value contained in at least
n − f of the intervals in S and h be the largest value contained in
at least n − f of the intervals in S (by assumption, these values
must exist). Let $\bigcap_{f,n}(S)$ be the interval [l .. h].

Algorithm 1 is inexpensive: it can be implemented in $O(n \log n)$ time.
Appendix B gives an implementation that has this running time.
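One way to reach this running time is an endpoint sweep, sketched below. This is an illustration under stated assumptions and not the Appendix B implementation: intervals are closed `(lo, hi)` pairs, an empty interval (lo > hi) contains no point and is skipped while still counting towards n, and the function name `fault_tolerant_average` is hypothetical.

```python
def fault_tolerant_average(intervals, f):
    """Return (l, h): the smallest and largest values that lie in at least
    n - f of the n closed intervals, or None if no such value exists."""
    n = len(intervals)
    need = n - f
    if need < 1:
        raise ValueError("f must be smaller than the number of sensors")

    # Kind 0 marks a left endpoint and kind 1 a right endpoint. Sorting the
    # pairs puts left endpoints first at equal values, so intervals that only
    # touch at a point are still counted as overlapping there.
    events = []
    for lo, hi in intervals:
        if lo <= hi:  # empty intervals contribute no points
            events.append((lo, 0))
            events.append((hi, 1))
    events.sort()  # the O(n log n) step; the sweep itself is linear

    depth = 0  # number of intervals covering the current sweep position
    low = high = None
    for value, kind in events:
        if kind == 0:
            depth += 1
            if depth >= need and low is None:
                low = value  # first point covered by enough intervals
        else:
            if depth >= need:
                high = value  # last such point seen so far
            depth -= 1
    if low is None:
        return None
    return (low, high)
```

For the three intervals (0, 4), (2, 8) and (6, 10) with f = 1, the points covered by at least two intervals are [2 .. 4] and [6 .. 8], and the function returns their cover (2, 8). With f = 0 the same input yields `None`, because no point lies in all three intervals.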

The accuracy of $\bigcap_{f,n}(S)$ depends on the value of f, as illustrated in
Figure 3. In this example, the value of $\bigcap_{0,n}(S)$ is the empty interval
because it is impossible for both intervals a and b to contain the correct
value; at least one of them must be incorrect. In general and when defined,
$\bigcap_{0,n}(S)$ is the intersection of the intervals in S,
$\bigcap_{n-1,n}(S)$ is the cover of the intervals in S, and
$|\bigcap_{f,n}(S)| \le |\bigcap_{f',n}(S)|$ if $f < f'$.


Figure 3: Intersection with f = 1, 2, 3 and 4 (intervals a through e and the
resulting $\bigcap_{1,5}$, $\bigcap_{2,5}$, $\bigcap_{3,5}$ and $\bigcap_{4,5}$)

One consequence of the definition of $\bigcap_{f,n}(S)$ is that for $f > 0$,
$\bigcap_{f,n}(S)$ can contain values that cannot be the correct value. For
example, Figure 4 shows the intersection of three intervals a, b and c. If
f = 1 then the correct value must be within I1 or I2. Algorithm 1, however,
would calculate the interval I. The points between I1 and I2 are added to
preserve the “shape” of the abstract sensor as seen by the control program.

It is instructive to compare $\bigcap_{f,n}(S)$ with n-modular redundancy [20]
(NMR). In NMR, n independently produced values of a variable are presented
to a voter that selects the majority value as its output. By doing so, the
voter can mask up to f incorrect inputs where $n \ge 2f + 1$. The function
$\bigcap_{f,n}(S)$ resembles an NMR voter, except that it accepts intervals
rather than points as inputs and it produces the most accurate value possible
as output for any value of f with $0 \le f < n$. If the inputs to
$\bigcap_{f,n}(S)$ are point intervals (that is, have a width of zero), then
the NMR voter and $\bigcap_{f,n}(S)$ produce the same output when
$n \ge 2f + 1$.


Figure 4: Intersection with n = 3 and f = 1 (intervals a, b and c, the regions
I1 and I2, and the resulting interval I)

The relation of f to n (and hence the accuracy of $\bigcap_{f,n}(S)$) depends on
the failure model that is assumed. We will first assume arbitrary failures
(both with and without bounded inaccuracy) and then consider a fail-stop
failure model. We assume that no more than f of the n sensors can be faulty
and that once failed, a sensor remains failed.

3.2  Arbitrary  Failures 

The width of an interval that is an abstract sensor value determines the
sensor’s accuracy. If the ratio f/n of the number of faulty abstract sensors
to the total number of abstract sensors is too large, then one cannot bound
the inaccuracy of the resulting abstract sensor. The following theorem bounds
f/n. Define the functions $\min_i$ and $\max_i$ to be the i-th smallest and
i-th largest values of a set of n values, respectively. Note that $\min_i$ is
the same as $\max_{n-i+1}$. For example, if S = {13, 14, 15} then
$\min_3(S) = \max_1(S) = 15$.

Theorem 1 If $f < \lfloor (n+1)/2 \rfloor$ then $|\bigcap_{f,n}(S)| \le \min_{2f+1}\{|s| : s \in S\}$.

The  proof  of  this  theorem  is  in  Appendix  A. 
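The order-statistic functions and the bound of Theorem 1 can be sketched as follows. The helper names are hypothetical, ranks are 1-based as in the text, and the hypothesis is taken to be f < ⌊(n + 1)/2⌋, matching the threshold stated for the unbounded case.

```python
def min_i(values, i):
    """The i-th smallest of the values (1-based)."""
    return sorted(values)[i - 1]


def max_i(values, i):
    """The i-th largest of the values (1-based)."""
    return sorted(values, reverse=True)[i - 1]


def theorem1_width_bound(widths, f):
    """Upper bound on the width of the averaged interval given the widths of
    the n input intervals, or None when f is too large for the theorem."""
    n = len(widths)
    if f >= (n + 1) // 2:
        return None
    return min_i(widths, 2 * f + 1)
```

With five sensors of widths 1 through 5 and f = 1, the bound is the third smallest width, 3. With f = 0 the bound is the smallest width, which matches the fact that an intersection is never wider than its narrowest member.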

If $f \ge \lfloor (n+1)/2 \rfloor$ then the derived interval can be more
inaccurate than any sensor in the system. Theorem 4 in Appendix A formally
states this property. An example is shown in Figure 5. Suppose the three
sensors a, b and c are “maliciously” faulty. They can make $\bigcap_{f,n}(S)$
as inaccurate as desired by choosing values appropriately distant from
intervals d and e.


Figure 5: Intersection with n = 5 and f = 3

One property of $\bigcap_{f,n}(S)$ is that, depending on the values of S,
$\bigcap_{f,n}(S)$ can be more accurate than any sensor in S. Figure 6
illustrates this property. Such a value of S can result from different delays,
errors, or other sources of uncertainty that arise in computing the value of
the abstract sensors comprising S. This property makes replication of abstract
sensors attractive not only for tolerating failures, but also for increasing
the expected accuracy of a sensor’s value.


Figure 6: Intersection with n = 5 and f = 1

If n = 2f + 1, however, then the accuracy of $\bigcap_{f,n}(S)$ is limited, in
that it cannot be more accurate than the most accurate sensor in S. This is
illustrated in Figure 7 where f = 1 and n = 3; here, the only way we could
change sensor c so that it contains values outside of $\bigcap_{f,n}(S)$ would
be to make c more inaccurate than either a or b or to make c detectably faulty
(as discussed in Section 3.3). It is, therefore, advantageous to have
n > 2f + 1 for a system with arbitrary failures. Theorems 5 and 6 in
Appendix A formally state this property.


Figure 7: Intersection with n = 3 and f = 1

Theorem 1 bounds the accuracy of a derived abstract sensor in terms
of the accuracy of one of the abstract sensors $s_i$ used in its construction.
Such a bound is useful only if $s_i$ is not faulty; in particular, if
$|s_i| \le acc_{s_i}$. Hence, Theorem 1 only applies for arbitrary failures
with bounded inaccuracy. If this bounding sensor could have an erroneously
large inaccuracy, then the bound is not meaningful. Consider the sensors shown
in Figure 8. If sensor c is erroneously inaccurate, then the value of
$\bigcap_{1,3}(S)$ is as inaccurate as c. Thus, the ratio f/n of the number of
faulty abstract sensors to the total number of abstract sensors must be
smaller than that stated in Theorem 1 when sensors can have unbounded
inaccuracy. Theorem 2 gives this bound on f/n.


Figure 8: Intersection with n = 3 and f = 1
Theorem 2 Let C be the (unknown) subset of S that are correct. If $f < n/3$
then $|\bigcap_{f,n}(S)| \le \min_{f+1}\{|s| : s \in C\}$.

The proof of this theorem is simple: from Theorem 1,

$$|\bigcap_{f,n}(S)| \le \max_{n-2f}\{|s| : s \in S\}$$

For $|\bigcap_{f,n}(S)|$ to be bounded by a correct sensor, $n - 2f > f$ and so
$n > 3f$. The worst case is when f faulty sensors are the most inaccurate, so

$$|\bigcap_{f,n}(S)| \le \min_{f+1}\{|s| : s \in C\}$$

□

Under the hypothesis of Theorem 2, a minimum of four sensors are necessary
to tolerate a single faulty sensor. Figure 9 illustrates this case: even
if sensor d has an erroneously large inaccuracy, $|\bigcap_{1,4}(S)|$ is
bounded by a nonfaulty sensor.


Figure 9: Intersection with n = 4 and f = 1

3.3  Other  Issues  on  Failure 

If $f' \le f$ sensors can be detected as failed then they can be removed from
S, and n and f can be reduced by f' before computing $\bigcap_{f,n}(S)$. By
doing so, the ratio f/n will be decreased, thereby improving the bound on the
inaccuracy of $\bigcap_{f,n}(S)$. In a fail-stop failure model, all sensor
failures are detectable, meaning that up to n − 1 failures can be tolerated
and $\bigcap_{f,n}(S)$ will be as accurate as the most accurate nonfaulty
sensor. Additionally, the running time of Algorithm 1 with fail-stop sensors
is $O(n)$.
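A minimal sketch of the fail-stop case follows. It assumes that a failed sensor reports an empty interval (lo > hi) and that every remaining interval is correct, so the result is simply the intersection of the remaining intervals, computed in one linear pass. The function name `fail_stop_average` is hypothetical.

```python
def fail_stop_average(intervals):
    """Intersect all non-failed readings in O(n).

    Returns None when every sensor has failed. Raises ValueError when the
    surviving readings do not share a point, which would contradict the
    fail-stop assumption that undetected sensors are correct.
    """
    live = [(lo, hi) for lo, hi in intervals if lo <= hi]
    if not live:
        return None
    low = max(lo for lo, _ in live)
    high = min(hi for _, hi in live)
    if low > high:
        raise ValueError("surviving sensors are inconsistent")
    return (low, high)
```

Because no sorting is needed, the cost is linear in the number of sensors, and the result is never wider than the most accurate surviving reading.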
We can use Algorithm 1 to detect some failed abstract sensors assuming
an arbitrary failure model. This algorithm is very simple: any sensor in
S that does not intersect $\bigcap_{f,n}(S)$ cannot contain the correct value,
and is therefore incorrect.

Algorithm 2 Detecting failed sensors.

Given n sensors S and a maximum number of faulty sensors f,
find a subset of the sensors $\mathcal{D} \subseteq S$ that are incorrect.

Implementation: Compute $\bigcap_{f,n}(S)$ using Algorithm 1. Then,

$$\mathcal{D} = \{s : s \in S \wedge s \cap \bigcap_{f,n}(S) = \emptyset\}$$

It is likely that Algorithm 2 will fail to detect some of the incorrect
sensors. For example, using Algorithm 2 with the sensors in Figure 4 yields
$\mathcal{D} = \emptyset$; even though we know that exactly one of the two
sensors a and c must be incorrect, we cannot tell which of the two it is.
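Algorithm 2 can be sketched as a single pass over the sensors once the averaged interval is known. In this illustration the averaged interval is passed in as the `fused` argument (it would come from Algorithm 1), intervals are closed `(lo, hi)` pairs, and the function name `detect_failed` is hypothetical.

```python
def detect_failed(intervals, fused):
    """Return the indices of sensors whose interval does not intersect the
    averaged interval `fused`; such sensors cannot contain the correct value."""
    fused_lo, fused_hi = fused
    failed = []
    for index, (lo, hi) in enumerate(intervals):
        is_empty = lo > hi
        is_disjoint = max(lo, fused_lo) > min(hi, fused_hi)
        if is_empty or is_disjoint:
            failed.append(index)
    return failed
```

For the readings (0, 5), (2, 8), (4, 10) and (20, 22) with f = 1, the averaged interval is (4, 5) and only the last sensor is reported. For three readings arranged as in Figure 4, such as (0, 4), (2, 8) and (6, 10) with averaged interval (2, 8), nothing is reported even though the first and last readings cannot both be correct.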

So far, we have assumed that once a sensor fails it remains failed. This
assumption may not be realistic for sensors, since an abstract sensor
maintains no state. It seems natural to assume a sensor may occasionally fail
in an apparently malicious way and then “heal” itself and subsequently yield
correct values. So, a natural extension to the arbitrary failure model is to
denote the faulty sensors at time t as a function $\mathcal{F}(t)$ such that
$\forall t : |\mathcal{F}(t)| \le f$. Unfortunately, we cannot construct a
correct abstract sensor under these conditions; the averager might be unlucky
and each time read a (temporarily) incorrect abstract sensor. We must also
guarantee that there exists a period $\Pi$ such that the number of failures in
all time intervals of length $\Pi$ is bounded:

$$\exists \Pi > 0 : \forall t : \Big| \bigcup_{t \le t' \le t + \Pi} \mathcal{F}(t') \Big| \le f$$

If Algorithm 1 obtains values from each concrete sensor within $\Pi$ time units
then it constructs a correct abstract sensor. In the limit of large $\Pi$, this
model reduces to the earlier arbitrary failure model.

4  Example 

The  methodology  presented  in  this  paper  requires  some  thought  to  use. 
An  original  specification  may  have  to  be  changed  to  accommodate  abstract 
sensors,  and  it  may  be  difficult  to  construct  a  set  of  independent  abstract 
sensors. In this section, we show an example of how a specification can be
converted  from  one  that  uses  physical  state  variables  to  one  using  abstract 
sensors.  We  also  show  how  an  abstract  sensor  can  be  implemented  from  a 
concrete  sensor. 

As part of the Cornell Real-Time Reliable Distributed Systems (RR)
project, we are deriving correct process control programs from specifications.
One of the problems we have chosen is that of a train traversing a sequence of
n adjacent track segments of possibly unequal lengths. Assume that segment
i spans track locations $c_i$ through $c_{i+1}$ where
$(\forall i : 0 \le i < n : c_i < c_{i+1})$. A train has position x(t) and
velocity v(t), has zero length (see footnote 5), starts at position
$c_0 = 0$ and moves in the direction of increasing x (towards $c_1$). Each
track segment has an associated minimum and maximum speed $min_i$ and
$max_i$; if the train exceeds these limits, it may derail. Additionally, there
is a random communications delay associated with all messages in the system
that is bounded by $\delta$ seconds.

A track circuit $\sigma(q,r)$ is a concrete sensor associated with a span of
track $q \le x \le r$. A nonfaulty track circuit returns true iff the train
occupies any part of the circuit’s span at the time the circuit is polled. We
will assume that there are M track circuits.

The safety condition for correct operation of the train is that it not
derail, or

$$S \stackrel{\mathrm{def}}{=} \forall t, i : 1 \le i \le n : c_i \le x(t) \le c_{i+1} \Rightarrow min_i \le v(t) \le max_i$$

S is expressed in terms of physical variables, so it must be changed to
be expressed in terms of abstract sensors. The obvious condition is

$$S' \stackrel{\mathrm{def}}{=} \forall t, i : 1 \le i \le n : (\exists x \in \bar{x}(t) : c_i \le x \le c_{i+1}) \Rightarrow (\forall v \in \bar{v}(t) : min_i \le v \le max_i)$$

since this also excludes all unsafe states (at a penalty of running the train
conservatively).
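A sketch of how S' could be evaluated at a single instant is given below. It assumes non-empty closed intervals stored as `(lo, hi)` pairs, a list `boundaries` of segment end points and a list `limits` of per-segment `(min, max)` speeds, both indexed from zero; all names are hypothetical.

```python
def segment_may_be_occupied(x_bar, seg_start, seg_end):
    """(exists x in x_bar : seg_start <= x <= seg_end) for closed intervals."""
    x_lo, x_hi = x_bar
    return max(x_lo, seg_start) <= min(x_hi, seg_end)


def speed_surely_within(v_bar, v_min, v_max):
    """(forall v in v_bar : v_min <= v <= v_max) for a closed interval."""
    v_lo, v_hi = v_bar
    return v_min <= v_lo and v_hi <= v_max


def satisfies_s_prime(x_bar, v_bar, boundaries, limits):
    """Check S' at one instant: every segment the train might occupy must
    have speed limits that every possible velocity respects."""
    for i, (v_min, v_max) in enumerate(limits):
        if segment_may_be_occupied(x_bar, boundaries[i], boundaries[i + 1]):
            if not speed_surely_within(v_bar, v_min, v_max):
                return False
    return True
```

When the position interval straddles a segment boundary, the train must respect the limits of both segments at once, which is the conservative running mentioned above.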

Since the condition S' refers to the abstract sensors $\bar{x}$ and $\bar{v}$,
the control program will need to refer to these sensors. We will show how an
abstract position sensor $\bar{x}$ can be constructed from the track circuits
$\sigma(q,r)$. The simplest way to do this is to assume a bound on the
velocity of the train $v \le v_{max}$. Define the global array of M elements:

var train[i]: {before, in, after} := before, ..., before;

Footnote 5: In Appendix C we show that controlling a train of length L > 0 is
equivalent to controlling a train of zero length.

Define a polling process for each track circuit $\sigma(q,r)$. Note the delay
is represented by a delay statement: the implementation must ensure that no
more than $\Delta$ seconds elapse between successive polls of a sensor where
$\Delta$ is small enough so that the polling process does not “miss” the train
traversing the track segment it is monitoring:
$\Delta < (r - q - \delta v_{max}) / v_{max}$. Assertion I
is a loop invariant, and t is the current time.

    process Poll[i] =
    begin
        {I : train[i] = before ⇒ 0 ≤ x(t) ≤ q + δ·vmax ∧
             train[i] = in     ⇒ q ≤ x(t) ≤ r + δ·vmax ∧
             train[i] = after  ⇒ r ≤ x(t) ≤ cn}
        do true →
            delay Δ;
            if  σ(q,r) ∧ (train[i] = before) → train[i] := in
            []  ¬σ(q,r) ∧ (train[i] = in)     → train[i] := after
            []  ¬σ(q,r) ∧ (train[i] = before) → skip
            []  σ(q,r) ∧ (train[i] = in)      → skip
            []  (train[i] = after)            → skip
            fi
        od;
    end
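For readers who prefer executable notation, the loop body of the polling process and the bound on the polling period can be sketched in Python as follows. This is a sketch under stated assumptions: the names `TrainState`, `max_poll_period` and `poll_step` are hypothetical, and the delay itself and the reading of the track circuit are left to the caller.

```python
from enum import Enum


class TrainState(Enum):
    BEFORE = "before"
    IN = "in"
    AFTER = "after"


def max_poll_period(q: float, r: float, delta: float, vmax: float) -> float:
    """Strict upper bound on the polling period: (r - q - delta*vmax) / vmax."""
    bound = (r - q - delta * vmax) / vmax
    if bound <= 0:
        raise ValueError("segment too short to be polled reliably")
    return bound


def poll_step(state: TrainState, circuit_on: bool) -> TrainState:
    """One iteration of the loop body, applied after the delay has elapsed."""
    if circuit_on and state is TrainState.BEFORE:
        return TrainState.IN
    if not circuit_on and state is TrainState.IN:
        return TrainState.AFTER
    return state  # the three remaining guards are all `skip`
```

Note that once the state is `AFTER` it never changes, which mirrors the last guard of the process above.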

The definition of the abstract sensor comes from the loop invariant $I$
and the distance the train could have moved since the last time $\sigma_{(q,r)}$ was
read:

    x̄_i(t) = if train[i] = before → [0 .. q + (δ + Δ)·vmax]
             [] train[i] = in     → [q .. r + (δ + Δ)·vmax]
             [] train[i] = after  → [r .. cn]
             fi
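The same case analysis can be written as a small Python function. The function name, the string encoding of the three states, and the parameter names (`poll` for the polling period, `c_n` for the end of the track) are assumptions made for this sketch.

```python
from typing import Tuple


def abstract_position(state: str, q: float, r: float, c_n: float,
                      delta: float, poll: float, vmax: float) -> Tuple[float, float]:
    """Interval of possible train positions implied by one track circuit.

    `slack` is the farthest the train can have moved during one message
    delay plus one polling period.
    """
    slack = (delta + poll) * vmax
    if state == "before":
        return (0.0, q + slack)
    if state == "in":
        return (q, r + slack)
    if state == "after":
        return (r, c_n)
    raise ValueError(f"unknown state: {state!r}")
```

For example, with `q = 100`, `r = 200`, `delta = 0.5`, `poll = 1.5` and `vmax = 10`, a circuit in state `in` bounds the position to the interval from 100 to 220.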

Fault-tolerance is achieved by constructing an abstract position sensor from
each track circuit and then using Algorithm 1. Additional fault-tolerance
could be achieved by replicating each track circuit.

The abstract sensor developed here is too simplistic to be of any real use.
Correct track circuits far away from the train give very inaccurate bounds
on the train's location, and by Theorem 2 the accuracy of the fault-tolerant
abstract sensor will be poor for any reasonable $f$. In the actual system,
we make use of an abstract sensor $\bar{x}_0(t)$ whose value is derived from the
initial condition $x(0) = 0$ and from the commands sent to the train. We
call this abstract sensor a model sensor since if it is incorrect, then either
the control program is faulty or the specification of the environment was
incorrect. The model sensor is initially very accurate, and can be used
to detect some of the failures of the abstract sensors $\bar{x}_i$. Having a model
sensor also simplifies the computation of the other abstract sensors. The
train has the property that if an abstract sensor $\bar{x}_i$ is computed from a
fixed set of track circuit polls and the commands sent to the train, then
the interval $[\bar{x}_i(t).min - \bar{x}_0(t).min \;..\; \bar{x}_i(t).max - \bar{x}_0(t).max]$ is a constant.
So, the implementation of $\bar{x}_i$ computes an accurate value of
$[\bar{x}_i(t).min - \bar{x}_0(t).min \;..\; \bar{x}_i(t).max - \bar{x}_0(t).max]$ at the time $t$ it notes the
track circuit first coming on, and computes $\bar{x}_i(t')$ for $t' > t$ as
$[\bar{x}_0(t').min + \bar{x}_i(t).min - \bar{x}_0(t).min \;..\; \bar{x}_0(t').max + \bar{x}_i(t).max - \bar{x}_0(t).max]$.
The implementation of $\bar{x}_i$ can do the same computation when the track
circuit subsequently goes off, and if the two resulting values of the
abstract sensor do not intersect then the abstract sensor is faulty.
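The offset computation and the consistency test described above can be sketched as follows. This is an illustrative sketch, not the actual implementation: intervals are assumed to be closed `(min, max)` pairs, and the three function names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (min, max); an assumed representation


def offset_from_model(x_i: Interval, x_0: Interval) -> Interval:
    """Constant offset of sensor i from the model sensor, taken at time t."""
    return (x_i[0] - x_0[0], x_i[1] - x_0[1])


def project_from_model(x_0_later: Interval, offset: Interval) -> Interval:
    """Value of sensor i at a later time t', rebuilt from the model sensor."""
    return (x_0_later[0] + offset[0], x_0_later[1] + offset[1])


def intersects(a: Interval, b: Interval) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

A usage pattern under these assumptions: compute one estimate from the offset recorded when the circuit came on and a second from the offset recorded when it went off, then declare the abstract sensor faulty if `intersects` returns `False` for the pair.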

For our program, it is necessary to ensure that $|\bar{x}(t)| \le acc_x$, where
$acc_x$ is the length of the shortest track segment. Given a value of $\Delta$, one can
estimate the accuracy of abstract sensors near the train, as these will be
the most accurate. The abstract sensors $\bar{x}_i$ have a known bound on their
accuracy, so Theorem 1 can be used to find the maximum value of $f$ that
will guarantee $|\bar{x}(t)| \le acc_x$.
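One way to read this use of Theorem 1 is sketched below, assuming the bound of Theorem 1 as restated in Appendix A: for $0 \le f < \lfloor (n+1)/2 \rfloor$ the fault-tolerant sensor is no wider than the $(2f+1)$st smallest abstract sensor. The function name and the list-of-widths input are assumptions of this sketch.

```python
from typing import List, Optional


def max_tolerable_faults(widths: List[float], acc_x: float) -> Optional[int]:
    """Largest f for which the Theorem 1 bound guarantees width <= acc_x.

    Returns None when even f = 0 cannot be guaranteed.
    """
    ordered = sorted(widths)
    n = len(ordered)
    best = None
    for f in range((n + 1) // 2):        # enforces f < floor((n + 1) / 2)
        if ordered[2 * f] <= acc_x:      # (2f+1)st smallest width, 0-indexed
            best = f
        else:
            break                        # the bound only grows with f
    return best
```

With sensor widths 3, 1, 4, 1 and 5 and a required accuracy of 3.5, the sorted widths are 1, 1, 3, 4, 5, so the bound holds for $f = 1$ (width 3) but not for $f = 2$ (width 5).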

5  Discussion 

This  paper  presents  a  five-step  process,  through  which  a  program  written  in 
terms  of  physical  state  variables  can  be  transformed  into  one  that  reads  the 
physical  state  variable  through  a  set  of  concrete  sensors,  some  of  which  may 
be faulty. The degree of sensor replication depends on the failure model
being assumed. Figure 10 summarizes the maximum number of faulty sensors
that  can  be  tolerated  for  the  three  failure  models  considered  in  this  paper, 
assuming  that  an  unboundedly  accurate  sensor  is  desired. 

The work presented here is part of the general problem of input reification
[9]. The results in this paper are a generalization of the work done by
the author and presented in [15,14]. This earlier work looked at the problem
of clock synchronization in a distributed system. A clock is a special kind of
sensor, in that the physical process it senses can be expressed simply.

Figure 10: Maximum failures for different error models

The approach presented in Section 2.2 concerning transforming specifications
is novel. Much work has been done on expressing and determining
the validity of properties that refer to real time (for example, [7,19]), but
these specifications are typically written in terms of physical state
variables where, for each variable, an a priori upper bound on its accuracy
is known.

The  methodology  presented  in  this  paper  is  related  to  the  state  machine 
approach  [18,10].  A  set  of  sensors  of  the  same  physical  value  can  be  thought 
of  as  a  set  of  identical  processors  that  return  intervals  rather  than  scalar 
values.  In  both  cases,  failures  are  masked  by  replication  and  voting. 

Studies on hierarchies of failure models (for example, [2,16]) originally
arose in the context of the agreement problem [5], a problem not addressed
here. If the control program were to be replicated, then the processes of
this program would need to use an agreement protocol to disseminate the
sensors' values [3,4,8]. There has been work on agreement on the value
of sensors. For example, the inexact agreement problem discussed in [13]
relates the accuracy of the agreement value to the number of
rounds the protocol executes. A different approach to agreement among
sensors is taken in [12], in which sensor failure is not considered.

The methodology presented here is incomplete. For example, there are
other kinds of sensors than those considered here, such as discrete
sensors like one denoting whether or not a door is open, or multivalued
sensors like one that returns the altitude and azimuth of an airplane. We
are extending the material in this paper to accommodate these more general
sensors.

A  Proofs

The four theorems in this appendix give upper and lower bounds of $|\bigcap_{f,n}(\mathcal{S})|$.
We need the following two definitions:

Definition 1 If $\mathcal{S}$ is a set of intervals, a $c$-clique of $\mathcal{S}$ is a subset $\mathcal{S}'$ of $\mathcal{S}$
where $|\mathcal{S}'| = c$ and all the intervals in $\mathcal{S}'$ mutually intersect.

Definition 2 A set of intervals $\mathcal{S}$ is $c$-reduced if each interval in $\mathcal{S}$ is a
member of a $c$-clique.

Note that a graph is $(n - f)$-reduced if and only if Algorithm 2 computes
the empty set.
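Definition 2 can be tested directly for small inputs. The sketch below relies on a standard fact about intervals on the real line that is assumed here rather than taken from this paper: mutually intersecting closed intervals share a common point, and the leftmost such point is the minimum of one of them. An interval is therefore in a $c$-clique exactly when some interval minimum lying inside it is covered by at least $c$ intervals. The function names are hypothetical, and the quadratic running time is chosen for clarity, not efficiency.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval (min, max)


def depth(intervals: List[Interval], x: float) -> int:
    """Number of intervals that contain the point x."""
    return sum(1 for lo, hi in intervals if lo <= x <= hi)


def in_c_clique(s: Interval, intervals: List[Interval], c: int) -> bool:
    """True when s (a member of `intervals`) belongs to some c-clique."""
    return any(depth(intervals, lo) >= c
               for lo, _ in intervals if s[0] <= lo <= s[1])


def is_c_reduced(intervals: List[Interval], c: int) -> bool:
    """True when every interval is a member of a c-clique (Definition 2)."""
    return all(in_c_clique(s, intervals, c) for s in intervals)
```

For instance, the intervals (0, 4), (1, 5) and (2, 6) form a 3-reduced set, but adding (10, 11) yields a set that is only 1-reduced.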

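The upper and lower bounds of $|\bigcap_{f,n}(\mathcal{S})|$ are as follows. Theorem 1 is
the same as Theorem 3; it is repeated here for clarity:

Theorem 3 Let $\mathcal{S}$ be a set consisting of $n$ intervals. If $0 \le f < \lfloor (n+1)/2 \rfloor$
and $\bigcap_{f,n}(\mathcal{S}) \ne \emptyset$, then $|\bigcap_{f,n}(\mathcal{S})| \le \min_{2f+1}\{|s| : s \in \mathcal{S}\}$.

Theorem 4 Given a set $\{\ell_1, \ell_2, \ldots, \ell_n\}$ of $n$ lengths and $n > f \ge \lfloor (n+1)/2 \rfloor$,
then for any length $A \ge \min\{\ell_1, \ell_2, \ldots, \ell_n\}$ there exists a set of $n$ intervals
$\mathcal{S} = \{s_1, s_2, \ldots, s_n\}$ where $\forall i : 1 \le i \le n : |s_i| = \ell_i$ and $|\bigcap_{f,n}(\mathcal{S})| = A$.

Theorem 5 Let $\mathcal{S}$ be a $(n - f)$-reduced set of $n$ intervals. If $n > f \ge \lfloor n/2 \rfloor$
and $\bigcap_{f,n}(\mathcal{S}) \ne \emptyset$, then $|\bigcap_{f,n}(\mathcal{S})| \ge \max_{2(n-f)-1}\{|s| : s \in \mathcal{S}\}$.

Theorem 6 Given a set $\{\ell_1, \ell_2, \ldots, \ell_n\}$ of $n$ lengths, an arbitrarily small
length $\epsilon$, and $0 \le f < \lfloor n/2 \rfloor$, there exists a $(n - f)$-reduced set of $n$ intervals
$\mathcal{S} = \{s_1, s_2, \ldots, s_n\}$ where $\forall i : 1 \le i \le n : |s_i| = \ell_i$ and $|\bigcap_{f,n}(\mathcal{S})| = \epsilon$.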

Theorem 4 can be shown by construction. Let $\mathcal{S}$ consist of the following
two cliques:

•  $C_1$ containing $n - f$ intervals, where each interval in this clique has
a minimum value of $u$ and, by definition of $A$, a maximum value no
larger than $u + A$;

•  $C_2$ containing $f$ intervals, where each interval in this clique has a
maximum value of $u + A$ and, by definition of $A$, a minimum value no
smaller than $u$.

By hypothesis, $\lceil n/2 \rceil = \lfloor (n+1)/2 \rfloor \le f < n$, or $2f \ge 2\lceil n/2 \rceil \ge n$
and so $f \ge n - f$, meaning both cliques are contained in $\bigcap_{f,n}(\mathcal{S})$. So,
$\bigcap_{f,n}(\mathcal{S}) = [u \;..\; u + A]$ and the theorem follows. □

Theorem 6 can also be shown by construction. Let $\mathcal{S}$ consist of two
cliques:

•  $C_1$ containing $\lfloor n/2 \rfloor$ intervals such that $\lfloor (n - f)/2 \rfloor$ intervals have a
maximum value of $u + \epsilon$ and the remaining $\lfloor n/2 \rfloor - \lfloor (n - f)/2 \rfloor$ intervals
have a maximum value less than $u$;

•  $C_2$ containing $\lceil n/2 \rceil$ intervals such that $\lceil (n - f)/2 \rceil$ intervals have a
minimum value of $u$ and the remaining $\lceil n/2 \rceil - \lceil (n - f)/2 \rceil$ intervals
have a minimum value greater than $u + \epsilon$.

By hypothesis, $0 \le f < \lfloor n/2 \rfloor$ or $\lfloor n/2 \rfloor \le \lceil n/2 \rceil < n - f \le n$, and so
neither $C_1$ nor $C_2$ is entirely in $\bigcap_{f,n}(\mathcal{S})$. However, $n - f$ intervals intersect
over the interval $[u \;..\; u + \epsilon]$, and the theorem follows. □

To prove Theorems 3 and 5, we will need a few lemmas.

Lemma 1 Let $\mathcal{S}$ be a set of $n$ intervals where $\mathcal{S}$ contains at least one
$c$-clique and all $c$-cliques in $\mathcal{S}$ have exactly $i$ intervals in common with each
other. Then, $n \ge c \ge i$ and $n \ge 2c - i$.

Proof: since $\mathcal{S}$ contains at least one $c$-clique, we know $n \ge c$.
Furthermore, since all $c$-cliques in $\mathcal{S}$ have exactly $i$ intervals in common, each
$c$-clique must have at least $i$ intervals, or $c \ge i$.

If $c = i$, then the smallest graph satisfying our assumptions is a single
$i$-clique, or $n = i = 2c - i$. If $c > i$, then $\mathcal{S}$ must contain more than one
$c$-clique, for otherwise the single $c$-clique has $c > i$ intervals in common with
itself. The smallest such set of intervals consists of two $c$-cliques sharing $i$
intervals. Each clique has $c - i$ intervals not in common with the other, or
$n = i + 2(c - i) = 2c - i$. □

Lemma 2 Let $\mathcal{S}$ be a set of $n$ intervals where $\mathcal{S}$ contains at least one
$c$-clique. If $n < 2c$, then all $c$-cliques in $\mathcal{S}$ have at least $2c - n$ intervals in
common with each other.

Proof: by contradiction. Suppose that all the $c$-cliques in $\mathcal{S}$ have exactly
$i'$ intervals in common with each other, where $i' < 2c - n$. By Lemma 1,
$\mathcal{S}$ contains at least $2c - i'$ intervals, or $n \ge 2c - i'$. Rearranging the last
inequality, we get $i' \ge 2c - n$, which contradicts our hypothesis. □

Lemma 3 Let $s \in \mathcal{S}$ be any member of all maximal cliques of $\mathcal{S}$. The cover
of the intersection of the maximal cliques is no larger than $|s|$.

Proof: The intersection of any maximal clique cannot contain any point
outside of $s$, since by definition that point is not in an intersection containing
$s$ and $s$ is a member of each clique. The cover only adds points between the
intersections. Since $\mathcal{S}$ is a set of intervals over the reals, $s$ must contain all
points between the maximal cliques, so the cover does not add any points
outside $s$. Since all the points in the cover are also in $s$, the cover cannot be
larger than $|s|$. □

Theorem 3 can now be shown. From the definition of $\bigcap_{f,n}(\mathcal{S})$, the
maximal clique in $\mathcal{S}$ must contain at least $n - f$ intervals, for otherwise
$\bigcap_{f,n}(\mathcal{S}) = \emptyset$. By assumption, $f < \lfloor (n+1)/2 \rfloor$ or $n < 2(n - f)$. By Lemma 2,
at least $n - 2f$ intervals intersect all cliques. By Lemma 3 the cover of the
intersection cannot be larger than any of these $n - 2f$ intervals. The cover,
however, may be larger than any of the remaining $2f$ intervals. In the worst
case, these remaining intervals are the smallest ones in $\mathcal{S}$, and the theorem
follows. □

Lemma 4 Let $\mathcal{S}$ be a $c$-reduced set of $n$ intervals, and let the intervals $s_i$
in $\mathcal{S}$ be ordered such that $\min s_i \le \min s_j$ if $i < j$. Then, the intervals
$s_1, s_2, \ldots, s_c$ form a $c$-clique.

Proof: by induction. The lemma is trivially true for $c = 1$ since any
interval is by itself a 1-clique. So, we assume the lemma holds for $c = k$
and show that it holds for $c = k + 1$. Let $\mathcal{S}$ be a $(k + 1)$-reduced set
of intervals. If a set is $(k + 1)$-reduced then it is $k$-reduced, so by the
induction hypothesis the intervals $s_1, s_2, \ldots, s_k$ form a $k$-clique. If $s_{k+1}$ does
not intersect some interval $s_i : 1 \le i \le k$, then all intervals $s_j : j \ge k + 1$
also do not intersect $s_i$, and so $s_i$ is not a member of a $(k + 1)$-clique. This
contradicts our assumption that $\mathcal{S}$ is $(k + 1)$-reduced, and so $s_{k+1}$ must
intersect each interval $s_1, \ldots, s_k$, and the lemma holds. □

The same argument can be used to prove the following lemma:

Lemma 5 Let $\mathcal{S}$ be a $c$-reduced set of $n$ intervals, and let the intervals $s_i$
in $\mathcal{S}$ be ordered such that $\max s_i \ge \max s_j$ if $i < j$. Then, the intervals
$s_1, s_2, \ldots, s_c$ form a $c$-clique.

Lemma 6 If $\mathcal{S}$ is a $(n - f)$-reduced set of $n$ intervals, then

$$\bigcap_{f,n}(\mathcal{S}) = [\min_{n-f+1}\{\min s : s \in \mathcal{S}\} \;..\; \max_{n-f+1}\{\max s : s \in \mathcal{S}\}]$$

Proof: this lemma follows directly from Lemma 4, Lemma 5 and the
definition of $\bigcap_{f,n}(\mathcal{S})$. □

Theorem 5 can now be shown. From Lemma 6, all intervals intersect
$\bigcap_{f,n}(\mathcal{S})$ and there are exactly $2(n - f - 1)$ (not necessarily distinct) intervals
that extend outside of $\bigcap_{f,n}(\mathcal{S})$. This means that there are at least
$n - 2(n - f - 1) = 2(f + 1) - n$ intervals that are completely contained by $\bigcap_{f,n}(\mathcal{S})$. So,
$|\bigcap_{f,n}(\mathcal{S})| \ge \min_{2(f+1)-n}\{|s| : s \in \mathcal{S}\}$ or $|\bigcap_{f,n}(\mathcal{S})| \ge \max_{2(n-f)-1}\{|s| : s \in \mathcal{S}\}$.
□

B  Algorithms for Computing $\bigcap_{f,n}(\mathcal{S})$

This section contains some algorithms for computing $\bigcap_{f,n}(\mathcal{S})$. A set of
abstract sensors is isomorphic to a class of graphs called interval graphs,
which in turn are members of the class of triangulated graphs. Such graphs
are interesting in that many problems, such as coloring, clique, stable set
and clique cover, can be solved for triangulated graphs in polynomial time.
A good reference on triangulated graphs is [6], which includes efficient
algorithms that solve the above problems.

The value of $\bigcap_{f,n}(\mathcal{S})$ is $[l \;..\; h]$ where $l$ is the smallest point contained in
$n - f$ intervals and $h$ is the largest point contained in $n - f$ intervals, and
where a point $x$ is contained in an interval $s$ if and only if $\min s \le x \le \max s$.
Suppose that there are $a$ intervals $s$ in $\mathcal{S}$ such that $\min s \le x$ and
that there are $b$ intervals $s'$ in $\mathcal{S}$ such that $\max s' < x$. Any interval not
counted in $a$ cannot contain $x$, and the intervals counted in $b$ are those that
were counted in $a$ but cannot contain $x$, so $x$ is contained in exactly $a - b$
intervals.

Let $v$ be an array of $2n$ pairs where for each $s_i \in \mathcal{S}$, $v_{2i} = (\min s_i, 1)$
and $v_{2i+1} = (\max s_i, -1)$. Given a point $x$,

$$a = \sum_{\forall i \,:\, v_i[1] \le x \,\wedge\, v_i[2] = 1} v_i[2]$$

and

$$b = - \sum_{\forall i \,:\, v_i[1] < x \,\wedge\, v_i[2] = -1} v_i[2]$$

or

$$\text{number of } s \in \mathcal{S} \text{ containing } x = a - b = \sum_{\forall i \,:\, v_i[1] < x} v_i[2] + \sum_{\forall i \,:\, v_i[1] = x \,\wedge\, v_i[2] = 1} v_i[2]$$
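The counting argument can be checked with a few lines of Python. This is a minimal sketch, assuming closed intervals stored as `(min, max)` pairs; the function name is hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval (min, max)


def count_containing(intervals: List[Interval], x: float) -> int:
    """Number of intervals containing x, computed as a - b."""
    a = sum(1 for lo, _ in intervals if lo <= x)   # intervals that start at or before x
    b = sum(1 for _, hi in intervals if hi < x)    # intervals that end strictly before x
    return a - b
```

For the intervals (0, 2), (1, 3) and (2, 5), the point 2 gives a = 3 and b = 0, so it lies in all three intervals, while the point 2.5 gives a = 3 and b = 1.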

Computing the number of intervals in $\mathcal{S}$ that contain $x$ can be made
linear if $v$ is sorted. Define $v_i < v_j \equiv (v_i[1] < v_j[1]) \vee (v_i[1] = v_j[1] \wedge v_i[2] > v_j[2])$,
and let $v'$ be $v$ sorted with respect to $<$. Then,

$$\text{number of } s \in \mathcal{S} \text{ containing } x = \sum_{i=0}^{\max j \,:\, v'_j[1] \le x \,\wedge\, ((v'_j[1] = x) \Rightarrow (v'_j[2] = 1))} v'_i[2] \qquad (1)$$

Recall that $l$ is the smallest point contained in $n - f$ intervals. Thus, $l$ is
the smallest $x$ that makes Equation 1 equal to $n - f$, which is $v'_{low}[1]$ where

$$low = \min j : \sum_{i=0}^{j} v'_i[2] = (n - f)$$

Similarly, $h$ is the largest point contained in $n - f$ intervals, which is the
largest $x$ that makes Equation 1 equal to $n - f$. This point is also the
maximum value of some interval such that all points greater than $x$ are
contained in no more than $n - f - 1$ intervals, or $h$ is $v'_{high}[1]$ where

$$high = \max j : \sum_{i=0}^{j} v'_i[2] = (n - f - 1)$$

Both $low$ and $high$ can be computed from $v'$ in $O(n)$ time, and $v'$ can be
computed from $v$ in $O(n \log n)$ time, so the overall running time is $O(n \log n)$.
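The sort-and-scan procedure can be sketched in Python as follows. This is a sketch under stated assumptions rather than the paper's own code: intervals are closed `(min, max)` pairs, the function name is hypothetical, and $h$ is located by remembering the last interval maximum at which the running count, taken just before that maximum is removed, is still at least $n - f$.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # closed interval (min, max)


def ft_intersection(intervals: List[Interval], f: int) -> Optional[Interval]:
    """Cover of all points contained in at least n - f intervals, or None."""
    need = len(intervals) - f
    events = []
    for lo, hi in intervals:
        events.append((lo, +1))
        events.append((hi, -1))
    # Sort by position; at equal positions a minimum (+1) precedes a maximum (-1).
    events.sort(key=lambda e: (e[0], -e[1]))

    low = high = None
    count = 0
    for pos, kind in events:
        if kind == -1 and count >= need:
            high = pos              # pos is still covered by >= need intervals
        count += kind
        if kind == +1 and count >= need and low is None:
            low = pos               # first point covered by >= need intervals
    return None if low is None else (low, high)
```

For the three intervals (8, 12), (11, 13) and (10, 12), the sketch yields the interval from 10 to 12 with `f = 1` and the interval from 11 to 12 with `f = 0`. The sort dominates the running time, matching the $O(n \log n)$ bound stated above.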

There are two cases for which $\bigcap_{f,n}(\mathcal{S})$ can be calculated faster than
$O(n \log n)$:

1.  $\bigcap_{n-1,n}(\mathcal{S})$ is the cover of $\mathcal{S}$, or $l$ is the smallest minimum value of the
intervals and $h$ is the largest maximum value of the intervals. For our
purposes, however, this case is not very interesting.

2.  If all of the intervals in $\mathcal{S}$ mutually intersect, then all of the minimum
values of these intervals are less than or equal to the smallest maximum
value of these intervals (this can be tested for in $O(n)$ time). Under this
condition, the array $v'$ consists of all of the minimum values (having
$v'_i[2] = 1$) followed by the maximum values (having $v'_i[2] = -1$). Thus,
$l$ is the $(f+1)$st largest minimum and $h$ is the $(f+1)$st smallest maximum,
both of which can be calculated in $O(n)$ time [1]. If $f = 0$ or a
fail-stop failure model is assumed, then we are interested in the value of
$\bigcap_{0,n}(\mathcal{S})$, which requires that all intervals mutually intersect and can
be calculated trivially in $O(n)$ time.
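The second case can be sketched as follows. The function name is hypothetical, and for brevity the sketch uses `heapq.nlargest` and `heapq.nsmallest`, which run in $O(n \log f)$ time rather than the linear-time selection of [1]; substituting a linear selection routine would recover the stated bound.

```python
import heapq
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval (min, max)


def ft_intersection_clique(intervals: List[Interval], f: int) -> Interval:
    """Fault-tolerant intersection when all intervals mutually intersect."""
    mins = [lo for lo, _ in intervals]
    maxs = [hi for _, hi in intervals]
    if max(mins) > min(maxs):                 # linear-time test from case 2
        raise ValueError("intervals do not all mutually intersect")
    if not 0 <= f < len(intervals):
        raise ValueError("f must satisfy 0 <= f < n")
    l = heapq.nlargest(f + 1, mins)[-1]       # (f+1)st largest minimum
    h = heapq.nsmallest(f + 1, maxs)[-1]      # (f+1)st smallest maximum
    return (l, h)
```

With `f = 0` the result is simply the largest minimum and the smallest maximum, that is, the ordinary intersection of all the intervals.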


C  Train  Length 

In the example of Section 4, we assumed the train had zero length. This
is not an unreasonable assumption, since we can show that for every train
of length $L$ on a track $K$, there exists a track $K'$ such that a zero-length
train on $K'$ is constrained in exactly the same way as the original train on $K$. In
this section, we show how to determine the track $K'$ from $L$ and $K$. The
method is an example of transforming to configuration space [11].

A track $K$ is defined by three sets $(\forall i : 1 \le i \le n : \{c_i\}, \{\mathrm{min}_i\}, \{\mathrm{max}_i\})$
where $c_i$ is the location of the end of track segment $i$, $\mathrm{min}_i$ is the minimum
allowable speed on segment $i$ and $\mathrm{max}_i$ is the maximum allowable speed on
segment $i$. If the train has length $L$ and the tail of the train is at $x$, the
safety condition is that all parts of the train satisfy the speed constraints,
or

$$S_L(x, v) \stackrel{\mathrm{def}}{=} \forall x', i : x \le x' \le x + L,\; 1 \le i \le n : c_i \le x' \le c_{i+1} \Rightarrow \mathrm{min}_i \le v \le \mathrm{max}_i$$

Suppose we could find a track $K' : (\forall i : 1 \le i \le n' : \{c'_i\}, \{\mathrm{min}'_i\}, \{\mathrm{max}'_i\})$
such that

$$S_L(x, v) \equiv (S_0(x, v) \stackrel{\mathrm{def}}{=} \forall i : 1 \le i \le n' : c'_i \le x \le c'_{i+1} \Rightarrow \mathrm{min}'_i \le v \le \mathrm{max}'_i)$$

$S_0$ is the safety condition for a zero-length train on track $K'$ which is
constrained in exactly the same way an $L$-length train on the original track
is constrained. If we can find $K'$ then we can write a program that controls
a zero-length train on $K'$, and this program will also control the $L$-length
train on $K$.

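Define the two functions

$$\mathrm{Min}(L, x) \stackrel{\mathrm{def}}{=} \max \{\mathrm{min}_j : x \le c_j \le x + L\}$$
$$\mathrm{Max}(L, x) \stackrel{\mathrm{def}}{=} \min \{\mathrm{max}_j : x \le c_j \le x + L\}$$

These functions determine the actual speed bounds the train must follow
when at $x$. With them, $S_L$ can be rewritten as $S_L : \mathrm{Min}(L, x) \le v \le \mathrm{Max}(L, x)$.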

We can now find the values of $K'$ that allow $S_L(x, v)$ to be rewritten as
$S_0(x, v)$. Both $\mathrm{Min}(L, x)$ and $\mathrm{Max}(L, x)$ are piecewise constant functions, so
we can define the track segments of $K'$ to be the spans where both $\mathrm{Min}(L, x)$
and $\mathrm{Max}(L, x)$ are constant. Let $\{c'_i\}$ be the union of the points of inflection
of $\mathrm{Min}(L, x)$ and $\mathrm{Max}(L, x)$, and let

$$\mathrm{min}'_i = \lim_{\delta \to +0} \mathrm{Min}(L, c'_i + \delta)$$

$$\mathrm{max}'_i = \lim_{\delta \to +0} \mathrm{Max}(L, c'_i + \delta)$$
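The construction of $K'$ can be illustrated with a short Python sketch. Several modelling choices here are assumptions of the sketch, not statements from the text: segment `i` is taken to span `[c[i], c[i+1]]`, a segment constrains the train whenever it overlaps the open span occupied by the train, and each span of $K'$ is evaluated at its midpoint instead of by the one-sided limit. The function name is hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float, Tuple[float, float]]  # (start, end, (min speed, max speed))


def configuration_space(c: List[float], vmin: List[float],
                        vmax: List[float], L: float) -> List[Segment]:
    """Segments of K' for a train of length L whose tail position is x."""
    end = c[-1] - L                       # the tail cannot go past this point
    if end <= c[0]:
        raise ValueError("train does not fit on the track")
    # The set of overlapped segments can only change when the tail or the
    # head of the train crosses a segment boundary.
    pts = sorted({p for b in c for p in (b, b - L) if c[0] <= p <= end})
    segments: List[Segment] = []
    for lo, hi in zip(pts, pts[1:]):
        mid = (lo + hi) / 2
        over = [i for i in range(len(c) - 1)
                if c[i] < mid + L and c[i + 1] > mid]
        limits = (max(vmin[i] for i in over), min(vmax[i] for i in over))
        if segments and segments[-1][2] == limits:
            segments[-1] = (segments[-1][0], hi, limits)   # merge equal neighbours
        else:
            segments.append((lo, hi, limits))
    return segments
```

For a two-segment track with boundaries 0, 10 and 20, speed limits (0, 8) and (2, 5), and a train of length 4, the sketch returns a track of length 16 whose first segment ends at 6, the tail position at which the head of the train enters the second segment.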

Figure 11 shows an example of $K'$ given $K$ and $L$. Each track segment is
drawn with the maximum speed above the segment and the minimum speed
below the segment. Note $K'$ is shorter than $K$ by $L$, since the end of the
train cannot traverse the whole length of $K$ without the train leaving $K$.
Here, $K$ and $K'$ have the same number of segments; in general, $K'$ can have
up to twice as many segments as $K$.

Figure 11: Configuration Space